# Memory-efficient inference

The models below are GGUF builds published by Mungert, most of them using IQ-DynamicGate ultra-low-bit (1-2 bit) quantization aimed at inference in memory-constrained environments.

| Model | License | Description | Tags | Author | Downloads | Likes |
|---|---|---|---|---|---|---|
| Qwen3 30B A6B 16 Extreme GGUF | | Ultra-low-bit quantization built on Qwen/Qwen3-30B-A3B-Base, supporting a 32k context length and suitable for a range of hardware environments. | Large Language Model, Transformers | Mungert | 1,321 | 1 |
| Phi 2 GGUF | MIT | Text generation model using IQ-DynamicGate ultra-low-bit (1-2 bit) quantization, suited to natural language processing and code generation tasks. | Large Language Model, Multilingual | Mungert | 472 | 2 |
| Granite 3.3 8b Instruct GGUF | Apache-2.0 | Ultra-low-bit (1-2 bit) quantization using IQ-DynamicGate technology, suited to memory-constrained environments. | Large Language Model | Mungert | 759 | 2 |
| Qwq 32B GGUF | Apache-2.0 | Ultra-low-bit (1-2 bit) quantization using IQ-DynamicGate technology, supporting multilingual text generation. | Large Language Model, English | Mungert | 5,770 | 17 |
| Olympiccoder 32B GGUF | Apache-2.0 | Code generation model based on Qwen2.5-Coder-32B-Instruct, quantized with IQ-DynamicGate ultra-low-bit technology for efficient inference in memory-constrained environments. | Large Language Model, English | Mungert | 361 | 3 |
| EXAONE Deep 32B GGUF | Other | 32B-parameter large language model supporting English and Korean, designed for text generation tasks. | Large Language Model, Multilingual | Mungert | 2,249 | 3 |
| EXAONE Deep 7.8B GGUF | Other | 7.8B-parameter model with IQ-DynamicGate ultra-low-bit (1-2 bit) quantization, supporting English and Korean text generation. | Large Language Model, Multilingual | Mungert | 1,791 | 5 |
| Qwen2.5 14B Instruct 1M GGUF | Apache-2.0 | Instruction-tuned model based on Qwen2.5-14B, supporting text generation and chat scenarios. | Large Language Model, English | Mungert | 1,600 | 3 |